test(conformance): arm the fixtures for 2026-07-28 serving and refresh the expected-failures baseline #2310

Merged
felixweinberger merged 7 commits into v2-2026-07-28 from fweinberger/conformance-2026-arming on Jun 16, 2026

@felixweinberger felixweinberger commented Jun 16, 2026

Contributor

Serve the 2026-07-28 protocol revision from the conformance fixture and gate it in CI

Motivation and Context

The conformance suite can already exercise the 2026-07-28 draft revision: draft scenarios
run over the stateless per-request lifecycle, the runner tells client fixtures the resolved
version via MCP_CONFORMANCE_PROTOCOL_VERSION, and --spec-version can force the
carried-forward scenarios onto the new revision. Our fixture, however, only spoke the 2025
stateful lifecycle and our legs never passed --spec-version, so the serving stack now on
v2-2026-07-28 had no referee. This wires the fixture to the new entry point, adds
spec-version-forced legs, and gates everything in CI as a three-way split:

  • 2025 legs (constant, gated): test:conformance:server (active) and
    test:conformance:client:all keep the shared baseline (expected-failures.yaml) and stay
    green with no regressions.
  • 2026 draft suite: test:conformance:server:draft exercises the draft-only scenarios
    against the new serving path; its remaining failures live in the shared baseline and burn
    down as the SDK gaps close.
  • 2026 carried-forward legs (new): test:conformance:server:2026 and
    test:conformance:client:2026 run --suite all --spec-version 2026-07-28 so the
    carried-forward scenarios are also exercised at the new revision. Both are gated against
    one new shared baseline (expected-failures.2026-07-28.yaml, client:/server: sections
    like the existing file) because the name-keyed shared baseline cannot express
    version-split outcomes.
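
The limitation named in the last bullet can be illustrated with a small sketch. The types, scenario names, and version labels below are hypothetical and are not the conformance runner's data model; the point is only that a baseline keyed by scenario name has one slot per name, so it cannot record "passes at 2025, fails at 2026-07-28" for the same scenario.

```typescript
// Hypothetical model of conformance results; not the runner's real data types.
type Outcome = 'pass' | 'fail';

interface Run {
  scenario: string;
  specVersion: string;
  outcome: Outcome;
}

// Collect, per scenario name, the distinct outcomes observed across versions.
function outcomesByName(runs: Run[]): Map<string, Set<Outcome>> {
  const byName = new Map<string, Set<Outcome>>();
  for (const run of runs) {
    const outcomes = byName.get(run.scenario) ?? new Set<Outcome>();
    outcomes.add(run.outcome);
    byName.set(run.scenario, outcomes);
  }
  return byName;
}

// Scenarios that both pass and fail depending on version: a single
// name-keyed entry cannot be correct for both legs at once.
function versionSplit(runs: Run[]): string[] {
  return Array.from(outcomesByName(runs).entries())
    .filter(([, outcomes]) => outcomes.size > 1)
    .map(([name]) => name);
}

// Illustrative data only: the scenario names and version labels are made up.
const exampleRuns: Run[] = [
  { scenario: 'scenario-a', specVersion: '2025', outcome: 'pass' },
  { scenario: 'scenario-a', specVersion: '2026-07-28', outcome: 'fail' },
  { scenario: 'scenario-b', specVersion: '2025', outcome: 'fail' },
  { scenario: 'scenario-b', specVersion: '2026-07-28', outcome: 'fail' },
];

console.log(versionSplit(exampleRuns)); // scenario-a is the only version-split name
```

Any name reported by a check like this needs a per-version baseline file, which is the role expected-failures.2026-07-28.yaml plays for the two new legs.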

This also bumps the pinned @modelcontextprotocol/conformance release from 0.2.0-alpha.3 to
0.2.0-alpha.4: its mock servers now include resultType in results, which the SDK's
2026-07-28 client decode requires, so the carried-forward client scenarios can complete at
the new revision.

What changed

  • everythingServer.ts: requests claiming the per-request _meta envelope (including
    server/discover and malformed claims) are served through createMcpHandler, backed by
    the same fixture server definition; everything else stays on the existing stateful session
    path unchanged.
  • everythingClient.ts: reads MCP_CONFORMANCE_PROTOCOL_VERSION; tools_call speaks the
    2026 lifecycle on 2026-07-28 runs (version negotiation + per-request _meta); new
    request-metadata handler driven by versionNegotiation: { mode: 'auto' }.
  • New scripts test:conformance:server:2026 and test:conformance:client:2026
    (--suite all --spec-version 2026-07-28), plus one new step in each of the two jobs in
    .github/workflows/conformance.yml to run them.
  • New baseline expected-failures.2026-07-28.yaml shared by the two new legs, every entry
    with a reason comment, grouped by what unblocks it (15 server entries, 25 client entries).
  • expected-failures.yaml: request-metadata, caching,
    http-custom-header-server-validation, and server-stateless now pass and are removed;
    the header and the remaining entries' comments are updated to the current pin and the
    settled -32001/-32602 rejection codes.
  • @modelcontextprotocol/conformance pin 0.2.0-alpha.3 → 0.2.0-alpha.4 (lockfile change
    scoped to that package).
  • One client test name updated to stop describing the -32001 assignment as pending an
    upstream decision (assertions unchanged).
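
As a rough illustration of the everythingClient.ts bullet, a fixture could branch on the runner-provided variable along these lines. This is a minimal sketch, not the fixture's code: the function name and lifecycle labels are hypothetical, and the real client additionally relies on versionNegotiation: { mode: 'auto' }, which is not modelled here.

```typescript
// Sketch only: the lifecycle labels and function name are hypothetical.
type Lifecycle = 'stateful-2025' | 'per-request-2026';

const DRAFT_REVISION = '2026-07-28';

// Takes the environment as a parameter so the function is easy to test;
// a fixture would typically pass process.env.
function lifecycleFor(env: Record<string, string | undefined>): Lifecycle {
  const resolved = env['MCP_CONFORMANCE_PROTOCOL_VERSION'];
  // Anything other than the draft revision (including an unset variable)
  // falls back to the existing 2025 behaviour in this sketch.
  return resolved === DRAFT_REVISION ? 'per-request-2026' : 'stateful-2025';
}
```

Defaulting to the 2025 lifecycle when the variable is unset keeps the existing legs unchanged, which matches the intent that only 2026-07-28 runs switch behaviour.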

How Has This Been Tested?

  • Against the 0.2.0-alpha.4 pin, all legs exit 0 with zero stale baseline entries:
    test:conformance:server 42/0 (unchanged), test:conformance:server:draft 6→39 passed
    checks (14 expected-failure scenarios remain), test:conformance:client:all 317→324 passed
    checks (15 remain), and the new legs test:conformance:server:2026 63 passed / 18 failed
    checks and test:conformance:client:2026 207 passed / 36 failed checks, with every failure
    covered by the new baseline.
  • The new legs gate the same way as the existing ones: stale baseline entries and unexpected
    failures each make the leg exit non-zero (observed directly while burning down the
    baseline entries that started passing).
  • pnpm -r typecheck, lint:all, docs:check, and the client package test suite pass;
    pnpm install --frozen-lockfile succeeds against the updated lockfile.
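
The gating rule in the second bullet (unexpected failures and stale baseline entries both fail the leg) can be sketched as a comparison of two sets. The function and type names are hypothetical; the conformance runner's own implementation is not shown in this PR.

```typescript
// Hypothetical result shape for a single leg.
interface GateResult {
  unexpected: string[]; // failed, but not listed in the baseline
  stale: string[]; // listed in the baseline, but no longer failing
  exitCode: 0 | 1;
}

// Compare the scenarios that failed in this run with the expected-failures list.
function gate(failedScenarios: string[], baseline: string[]): GateResult {
  const failed = new Set(failedScenarios);
  const expected = new Set(baseline);
  const unexpected = failedScenarios.filter((name) => !expected.has(name));
  const stale = baseline.filter((name) => !failed.has(name));
  // The leg is green only when the two sets match exactly.
  return { unexpected, stale, exitCode: unexpected.length + stale.length > 0 ? 1 : 0 };
}
```

Treating stale entries as failures is what forces the burn-down: once a scenario starts passing, its baseline entry has to be removed before the leg goes green again.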

Breaking Changes

None. The change touches only the test fixture, CI wiring, baselines, and a dev-dependency
pin bump in a private package.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update

Checklist

  • I have read the MCP Documentation
  • My code follows the repository's style guidelines
  • New and existing tests pass locally
  • I have added appropriate error handling
  • I have added or updated documentation as needed

Additional context

Remaining expected-failure groups, by what unblocks them:

  • Multi-round-trip requests (SEP-2322): the input-required-result-* family (server, in both
    the draft suite and the 2026 carried-forward leg) and sep-2322-client-request-state.
  • SEP-2243 header enforcement in the SDK: the remaining http-header-validation reject cells
    (missing Mcp-Method/Mcp-Name headers and the Mcp-Name cross-check) and the client
    header scenarios.
  • SEP-2164 error semantics, the client auth scenarios (SEP-2468 / SEP-2352 / SEP-990), and
    SEP-2106 $ref handling: same entries as before, mirrored into the 2026 client section
    where those scenarios also run.
  • SEP-837 application_type during DCR plus the 2026 lifecycle in the fixture's auth flow:
    the auth entries that only fail on the forced-2026 client leg.
  • json-schema-2020-12 (server 2026 leg only): pre-existing fixture/baseline issue that
    fails identically at 2025 in --suite all; not a 2026-path regression.

http-custom-header-server-validation passes with all checks skipped (the fixture has no
custom-header-annotated tool yet); the SEP-2243 reject cells remain covered by the
http-header-validation entry.

@felixweinberger felixweinberger requested a review from a team as a code owner June 16, 2026 14:56
pkg-pr-new Bot commented Jun 16, 2026

@modelcontextprotocol/client

npm i https://pkg.pr.new/modelcontextprotocol/typescript-sdk/@modelcontextprotocol/client@2310

@modelcontextprotocol/codemod

npm i https://pkg.pr.new/modelcontextprotocol/typescript-sdk/@modelcontextprotocol/codemod@2310

@modelcontextprotocol/server

npm i https://pkg.pr.new/modelcontextprotocol/typescript-sdk/@modelcontextprotocol/server@2310

@modelcontextprotocol/server-legacy

npm i https://pkg.pr.new/modelcontextprotocol/typescript-sdk/@modelcontextprotocol/server-legacy@2310

@modelcontextprotocol/express

npm i https://pkg.pr.new/modelcontextprotocol/typescript-sdk/@modelcontextprotocol/express@2310

@modelcontextprotocol/fastify

npm i https://pkg.pr.new/modelcontextprotocol/typescript-sdk/@modelcontextprotocol/fastify@2310

@modelcontextprotocol/hono

npm i https://pkg.pr.new/modelcontextprotocol/typescript-sdk/@modelcontextprotocol/hono@2310

@modelcontextprotocol/node

npm i https://pkg.pr.new/modelcontextprotocol/typescript-sdk/@modelcontextprotocol/node@2310

commit: 7aec2d4

changeset-bot Bot commented Jun 16, 2026

⚠️ No Changeset found

Latest commit: 7aec2d4

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

…ture

Route modern-classified requests (per-request _meta envelope, server/discover)
through a createMcpHandler entry backed by the same fixture server definition;
legacy-classified traffic stays on the existing stateful 2025 session path
unchanged. Teach the everything client to read
MCP_CONFORMANCE_PROTOCOL_VERSION, negotiate the modern era with
versionNegotiation on 2026-07-28 runs, and handle the request-metadata
scenario.
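
The server-side split described above amounts to classifying each request before choosing a serving path. The sketch below shows that shape only. It ASSUMES that a request claims the envelope when params._meta carries a protocolVersion key; the rule the fixture actually uses is not spelled out in this commit message, and the function names are hypothetical.

```typescript
// Labels for the two serving paths: createMcpHandler vs. the stateful session path.
type Route = 'modern' | 'legacy';

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null;
}

// ASSUMPTION: a request "claims" the envelope when params._meta is an object
// carrying a protocolVersion key. The real classification rule may differ.
function classify(body: unknown): Route {
  if (!isRecord(body)) return 'legacy';
  // server/discover is routed to the modern path, per the commit message.
  if (body['method'] === 'server/discover') return 'modern';
  const params = body['params'];
  if (!isRecord(params)) return 'legacy';
  const meta = params['_meta'];
  return isRecord(meta) && 'protocolVersion' in meta ? 'modern' : 'legacy';
}
```

Keeping the classifier a pure function of the request makes it straightforward to leave legacy traffic untouched: anything it does not positively recognise stays on the 2025 path.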

Expected-failures burn-down (entries now passing, removed):
- request-metadata (client): the server/discover negotiation probe satisfies
  the SEP-2575 header/_meta/retry-on--32004 checks
- caching (server): 2026-era list/read results now carry ttlMs/cacheScope
- http-custom-header-server-validation (server): every check is SKIPPED
  because the fixture registers no x-mcp-header-annotated tool; the disputed
  SEP-2243 error-code cells stay covered by the http-header-validation entry

Comments on the remaining draft-suite entries updated to name what actually
blocks them now that the 2026-07-28 path is served (multi-round-trip requests,
the disputed error-code cells, error-id echo, removed-method handling).
…rged serving stack

Re-ran the burn-down on the integration branch tip. The error-id-echo cells
and the enveloped-initialize removed-method cell in server-stateless now pass
(error responses echo the request JSON-RPC id; an initialize carrying a valid
2026 envelope is answered 404/-32601), so the entry's note no longer lists
them as blockers. No entries are removed: server-stateless is still held by
the disputed envelope/header error-code cells pending conformance #336, and
every other entry still fails for the reason already recorded.
…26-07-28

The existing legs never pass --spec-version, so carry-forward scenarios were
only exercised at their default 2025 version. Add one leg per direction that
runs --suite all --spec-version 2026-07-28 against its own expected-failures
file (the shared name-keyed baseline cannot express version-split outcomes),
and wire both legs into the conformance workflow.

- test:conformance:server:2026 - 54 passed / 21 failed checks; baseline
  expected-failures.2026-07-28.yaml (16 scenarios: the same failures as the
  draft-suite leg plus json-schema-2020-12, which fails identically at 2025).
- test:conformance:client:2026 - 206 passed / 37 failed checks; baseline
  expected-failures.client.2026-07-28.yaml (26 scenarios: tools_call blocked
  by the referee mocks omitting resultType (fixed upstream, unblocks at the
  next published conformance release), the SEP-837 application_type check
  that only fires on draft-version runs, the auth scope-escalation scenarios
  cut short by the 2026 connection lifecycle, and the scenarios already
  baselined at 2025).

Both legs fail on unexpected failures and stale baseline entries, same as the
existing legs. The referee stays the published 0.2.0-alpha.3 pin.
…ed file

The 2025 legs share a single expected-failures.yaml with separate client: and
server: sections; mirror that shape for the 2026-07-28 carried-forward legs.
Merge expected-failures.client.2026-07-28.yaml into
expected-failures.2026-07-28.yaml (client: section added, entries and reasons
unchanged), delete the client-specific file, and point
test:conformance:client:2026 at the consolidated file. No entry changes.
…pha.4

0.2.0-alpha.4 makes the runner's mock servers include resultType in results,
which the SDK's 2026-07-28 client decode requires; this unblocks the
carried-forward client scenarios at the 2026 spec version. Lockfile change is
scoped to the conformance package.
With the rejection codes aligned to the referee (-32001 for header/body
mismatches, -32602 for a missing _meta envelope or protocolVersion key) and
the fixture serving the 2026-07-28 path, server-stateless passes fully
(21/21 checks) on the draft and 2026 server legs, so its entry leaves both
baselines. The 0.2.0-alpha.4 mock servers now include resultType in results,
which the SDK 2026 client decode requires, so tools_call passes on the 2026
client leg and leaves that baseline.
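
The two settled rejection codes can be summarised in a short sketch. The codes and their meanings come from the text above; the request shape, the placement of protocolVersion inside _meta, and the order of the checks are assumptions made for illustration and may not match the SDK.

```typescript
const HEADER_MISMATCH = -32001; // header/body mismatch, per the text above
const INVALID_PARAMS = -32602; // missing _meta envelope or protocolVersion key

// Hypothetical, simplified view of an incoming request.
interface IncomingRequest {
  headerVersion?: string; // version claimed by an HTTP header (assumed field)
  params?: { _meta?: { protocolVersion?: string } };
}

// Returns the JSON-RPC error code to reject with, or null if the request is accepted.
function rejectionCode(req: IncomingRequest): number | null {
  const bodyVersion = req.params?._meta?.protocolVersion;
  // A missing envelope and a missing key are rejected with the same code.
  if (bodyVersion === undefined) return INVALID_PARAMS;
  if (req.headerVersion !== undefined && req.headerVersion !== bodyVersion) {
    return HEADER_MISMATCH;
  }
  return null;
}
```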

Also reconcile the shared baseline's header with the new pin (drop the
references to the previous release and to auth scenarios that the published
release now ships) and restate the http-header-validation reason in the 2026
baseline in terms of the settled codes: the cells still failing are the
missing-header and Mcp-Name cross-check ones, not the error-code cells.
…1 assignment

The -32001 ladder cell is no longer pending an upstream error-code decision:
it is the spec-assigned HeaderMismatch code. The probe classifier still never
treats it as modern evidence because deployed servers overload it for
session-not-found responses. Wording only; assertions unchanged.
@felixweinberger felixweinberger force-pushed the fweinberger/conformance-2026-arming branch from 815d374 to 7aec2d4 Compare June 16, 2026 18:11
@felixweinberger felixweinberger merged commit 5a4677f into v2-2026-07-28 Jun 16, 2026
14 checks passed
@felixweinberger felixweinberger deleted the fweinberger/conformance-2026-arming branch June 16, 2026 21:53
felixweinberger added a commit that referenced this pull request Jun 24, 2026